DETAILED ACTION

Response to Arguments 

1. Applicant's arguments filed 02/04/10 have been fully considered but they are not
persuasive.

Applicant argues that neither Schroder, nor Kaufholz nor Kataoka nor Rajan 
teach that subsequent utterance originating from the second position will be discarded 
if not preceded by the recognition of the predetermined keyword originating from the 
second position (Amendment, pages 12-18). 

The examiner disagrees, since Schroder et al., disclose that "an operator-control 
command which, after its input by the first user, allows voice commands from a 
second user to be accepted may be advantageously provided. It is checked 
whether the speech input was by the user already previously noted in method 
step 10. If this is the case, the input command for controlling the voice-controlled 
system is used in method step 8, for example for menu control or navigation. The 
user can carry out operator control from any desired place in the room without taking 
along the remote control unit" (determining whether the speech input was previously
used and, if so, using the speech input for menu control or navigation suggests discarding
a non-predetermined keyword originating from the second position; col. 2, lines 39-44;
col. 3, lines 49-52; col. 1, lines 44-47).

Applicant argues that neither Schroder, nor Kaufholz nor Kataoka nor Rajan
teach that utterances of other users at other positions are discarded (Amendment,
pages 15-18).

The examiner disagrees, since Rajan discloses that "the computer system 7 is
also arranged to process the signals from each of the microphones in order to separate
the speech signals from each of the users 1-1, 1-2 and 1-3. A system has been
described above which can separate the speech from multiple users even when they
are speaking together. As those skilled in the art will appreciate, the system can be
used to separate any mix of acoustic signals from different sources. For example,
if there are a number of users playing musical instruments, then the system may be
used to separate the music generated by each of the users. This can then be used
in various music editing operations. For example it can be used to discard
("remove") one or more of the musical instruments from the soundtrack"
(paragraphs 22 and 61).

Applicant argues that neither Schroder, nor Kaufholz nor Kataoka nor Rajan
teach discriminating between sounds originating from users who are located in front of
each other relative to the microphone array (Amendment, pages 15-18).

The examiner disagrees, since Rajan discloses "the computer system 7 is also
arranged to process the signals from each of the microphones in order to ("separate")
the speech signals from each of the users 1-1, 1-2 and 1-3" (users 1-1 and 1-3 are
located in front of each other relative to the microphone array; paragraph 22).

Claim Rejections - 35 USC § 103

2. The text of those sections of Title 35, U.S. Code not included in this action can 
be found in a prior Office action. 

3. Claims 1-10, 15, and 20 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Schroder et al. (US Patent 7,136,817) in view of Kaufholz (US Patent 
7,050,971), and further in view of Rajan (US PAP 2002/0150263). 

Regarding claims 1 and 9, Schroder et al. discloses a speech control unit for 
controlling an apparatus on basis of speech, comprising: 

a microphone array, comprising multiple microphones for receiving respective 
audio signals (see col. 4, lines 44 - 46); and 

a speech recognition unit for creating an instruction for the apparatus based on 
recognized speech items of the speech signal (see col. 4, lines 60-62, where the 
commands are recognized speech items), and a keyword recognition system for 
recognition of a predetermined keyword that is spoken by the user and which is
represented by a particular audio signal and the speech control unit being arranged to 
control the beam forming module (see col. 4, lines 60 - 62, where the commands are 
the predetermined keywords spoken), on basis of the recognition of the predetermined 
keyword, in order to enhance second components of the audio signals which represent 
a subsequent utterance originating from a second orientation of the user relative to the 
microphone array (see col. 2, lines 38 - 44); 



Application/Control Number: 10/532,469 Page 5 

Art Unit: 2626 

wherein the recognition of the predetermined keyword at the second orientation 
so that the subsequent utterance originating from the second orientation are accepted 
("The input command for controlling the voice-controlled system is used in method step 
8, for example for menu control or navigation"; col.2, lines 39 - 44, col.3, lines 49 - 52); 

wherein the subsequent utterance originating from the second orientation will be 
discarded if not preceded by the recognition of the predetermined keyword originating 
from the second orientation ("The input command for controlling the voice-controlled 
system is used in method step 8, for example for menu control or navigation"; col. 2,
lines 39-44, col. 3, lines 49-52; col. 1, lines 44-47).

Schroder et al. do not disclose a beam forming module for extracting a speech 
signal of a user; calibrates the beam forming module to allow the user from the first 
position to the second position. However this feature is well known in the art as 
indicated by Kaufholz. Kaufholz discloses a speech recognition apparatus that utilizes a 
beam former that creates a higher performance and resolution of the resulting 
microphone signal. The beam former may also select or even track an audio source.
Typically, the loudest source signal is identified (see col. 5, lines 8-15). Thus, it would
have been obvious to one of ordinary skill in the art at the time the invention was made
to utilize a beam forming module with the apparatus of Kaufholz for the benefit of a 
higher performance and resolution of the resulting microphone signal. 

However Schroder et al in view of Kaufholz do not specifically teach that 
utterances of other users at other positions are discarded, the second position including 
an orientation and a distance relative to the microphone array, and the speech control 
unit being configured to discriminate between sounds originating from users who are
located in front of each other relative to the microphone array.

Rajan discloses that current techniques employ an array of microphones and an 
adaptive beamforming technique in order to discard ("isolate") the speech from one 
of the users. The computer system 7 is also arranged to process the signals from 
each of the microphones in order to discriminate ("separate") the speech signals from 
each of the users 1-1, 1-2 and 1-3 (users 1-1 and 1-3 are located in front of each
other). The predetermined curved plots used may be circular arcs, in which case, the
spectrogram processing module 33 will be able to estimate, not only the orientation 
("direction") from which the speech emanated, but also the distance from the 
microphones of that user (paragraph 2, lines 6-8; paragraph 22, last six lines; 
paragraph 57, last six lines). 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to separate the speech signals from each of the users 1-1,
1-2 and 1-3 as taught by Rajan in Schroder et al. in view of Kaufholz, because that
would help effectively identify the speech source Q) from which the
corresponding signal value has been received (paragraph 45). 

Regarding claim 2, Schroder et al. further disclose that the keyword recognition 
system is arranged to recognize the predetermined keyword that is spoken by another 
user and the speech control unit being arranged to control the beam forming module, on 
basis of this recognition, in order to enhance third components of the audio signals 
which represent another utterance originating from a third position of the other user
relative to the microphone array (see col. 2, lines 35-44). 

Regarding claim 3, Schroder et al. further disclose that a first one of the 
microphones of the microphone array is arranged to provide the particular audio signal 
to the keyword recognition system (see col. 4, lines 56-62). 

Regarding claim 4, Schroder et al. further disclose that the beam forming module 
is arranged to determine a first position of the user relative to the microphone array (see 
col. 4, lines 51-56). 

Regarding claim 5, Schroder et al. further disclose an apparatus comprising:
a speech control unit for controlling the apparatus on basis of speech as claimed in 
claim 1 (see col. 4, lines 60-62); and 

processing means for execution of the instruction being created by the speech 
control unit (see col. 4, lines 60-62). 

Regarding claim 6, Schroder et al. discloses an apparatus as claimed in claim 5, 
characterized in being arranged to show that the predetermined keyword has been 
recognized (see fig. 1, col. 3, lines 32-45).

Regarding claim 7, Schroder et al. discloses an apparatus as claimed in claim 6,
characterized in comprising audio generating means for generating an audio signal in
order to show that the predetermined keyword has been recognized (see fig. 1, col. 3,
lines 32-45).

Regarding claim 8, Schroder et al. discloses a consumer electronics system 
comprising the apparatus as claimed in claim 5 (see col. 4, lines 63-65). 

As per claims 10 and 15, Kaufholz further discloses that the user is informed by
indications that the speech control unit is not active, is in active state and ready to 
receive the utterance or is in a state of calibration ("the controller can also check which 
part is active at the moment of receiving input from the user"; col.7, lines 42 - 54). 

As per claim 20, Schroder et al., in view of Kaufholz, and further in view of Rajan 
suggest that the beam forming module is connected to the microphone array, and the 
keyword recognition system is connected to one microphone of the microphone array 
for detecting the predetermined keyword, the keyword recognition system being further 
connected to the beam forming module for providing the detected predetermined 
keyword to the beam forming module (Kaufholz "The apparatus 220 has two 
microphone inputs 224 and 226 for receiving the microphone signals from the 
respective outputs 204 and 214. All microphone signals (in the example two external 
microphone signals and one internal microphone signal) are supplied to a beam former 
240. The beam former combines the microphone signals, resulting in a higher
performance and resolution of the resulting microphone signal"; col. 4, line 58 - col. 5,
line 40; see also figs. 2 and 3).

4. Claims 11-14 and 16-19 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Schroder et al. (US Patent 7,136,817) in view of Kaufholz (US Patent 
7,050,971), and further in view of Rajan (US PAP 2002/0150263), and further in view of 
Kataoka (US PAP 2002/0181723). 

As per claims 11-14 and 16-19, Schroder et al., in view of Kaufholz, and further in
view of Rajan do not specifically teach that indications include an animal in a sleeping 
state indicating inactive state or in an awake state indicating active state; wherein the 
progress of the active state is indicated by angle of ears of the animal; wherein the ears 
are fully raised at a beginning of the active state, and fully down at an end of the active 
state; wherein the animal has an understanding look when the utterance is recognized 
and a puzzled look when the utterance is not recognized. 

Kataoka discloses that the direction of the targeted voice then can be inputted to
the servo system, whereby a face, eyes, an upper body, or the like of the robot can be
controlled accordingly (paragraph 38, last five lines); but Kataoka does not teach
active and inactive states of the speech control unit based on indication states of an
animal. However, Kataoka discloses that the robot may take a form of an
animal such as a mouse, a dog, a cat, or the like... after all, it is satisfactory so far
as the robot has capability of the posture control, head motion or eye direction
shifts toward the direction of the sound source (paragraph 36, last five lines;
paragraph 38, last four lines). One having ordinary skill in the art at the time the
invention was made would have found it obvious to indicate different states through an
animal in Kataoka, so that voice recognition can be performed with an input of a delay
sum corresponding to the directivity direction (Abstract, last two lines).



Conclusion 

5. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time 
policy as set forth in 37 CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the mailing date of this final action. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to LEONARD SAINT CYR whose telephone number is 
(571) 272-4247. The examiner can normally be reached on Mon-Friday.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Richemond Dorvil can be reached on (571) 272-7602. The fax phone 
number for the organization where this application or proceeding is assigned is
(571)-273-8300.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or (571)-272-1000. 
LS 

04/28/10 

/Richemond Dorvil/ 

Supervisory Patent Examiner, Art Unit 2626 



